Double-Talk and Path Change Detection Using A Matrix of Correlation Coefficients
BACKGROUND OF THE INVENTION 
Field of Invention 

This invention relates to a method of detecting double-talk and path changes in echo
cancellation systems. Echo cancellation is used extensively in telecommunications
applications to recondition a wide variety of signals, such as speech, data transmission,
and video.

Brief Description of the Prior Art 

The search for an effective echo cancellation procedure has produced several different
approaches with varying degrees of complexity, cost, and performance. A traditional
approach to echo cancellation uses an adaptive filter of length L, where L equals the
number of samples necessary to extend to just beyond the duration of the echo. Typically,
the adaptive filters contain either 512 or 1024 taps. At the standard telephone sampling rate
of 8000 samples per second, this provides the ability to adapt to echo paths as long as 64 ms
and 128 ms, respectively.
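The relationship between tap count and echo path coverage is simple arithmetic. The following minimal sketch reproduces the two figures quoted above; the function name `echo_span_ms` is illustrative and does not come from the original text.

```python
def echo_span_ms(num_taps: int, sample_rate_hz: int = 8000) -> float:
    """Longest echo path, in milliseconds, that a filter with num_taps taps can span."""
    return 1000.0 * num_taps / sample_rate_hz


print(echo_span_ms(512))   # 64.0
print(echo_span_ms(1024))  # 128.0
```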

The computational requirements of an adaptive filter are proportional to L for the popular
LMS (Least Mean Squares) class of algorithms, and proportional to $L^2$ or higher for
algorithms such as RLS (Recursive Least Squares). More robust algorithms (like RLS)
have greatly improved convergence characteristics over LMS methods, but the $L^2$
computational load makes them impractical with current technology. For this reason, the
LMS algorithm (and its variants) tends to remain the algorithm of choice for echo
cancellation.

Practical echo cancellation devices must provide some means of avoiding divergence
from double-talk. The double-talk condition arises when there is simultaneous
transmission of signals from both sides of the echo canceller due to the presence of
near-end speech in addition to the echo. Under such circumstances, the return echo path signal,
Sin (see Figure 1), contains both return echo from the echo source signal, and a
double-talk signal. The presence of a double-talk signal will prevent an LMS-based echo
canceller from converging on the correct echo path. It will also cause a pre-converged
echo canceller to diverge to unpredictable states. Following divergence, the echo canceller
will no longer cancel the echo, and must reconverge to the correct solution. Such
behaviour is highly unacceptable, and is to be avoided in actual devices. Some means of
detecting double-talk must therefore be implemented. To prevent divergence, the LMS
filter coefficients are typically frozen during the presence of double-talk.

Detecting double-talk quickly and reliably is a notoriously difficult problem. Even a small 
amount of divergence in a fully converged LMS filter will result in a significant increase 
in the residual echo level. The use of a fast and reliable double-talk detector is crucial to 
maintain adequate subjective performance. 

The simplest, and perhaps most common, method for detecting double-talk is to use
signal levels. The echo path typically contains a minimum amount of loss, or reduction, in
the return signal. This quantity is often referred to as the Echo Return Loss, or ERL. In
most systems, this is assumed to be at least 6 dB. In other words, the return signal Sin will
be at a level which is at least 6 dB lower than Rout provided that there is no double-talk.

In the presence of double-talk, the level at Sin often increases so that it is no longer 6 dB
lower than Rout. This condition provides a simple and convenient test for double-talk.
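The level comparison described above can be sketched as follows. This is only an illustration of the prior-art test, not part of the invention: the function names, the block-based power estimate, and the power floor are assumptions, and the 6 dB default is the ERL assumption quoted in the text.

```python
import math


def level_db(samples):
    """Mean power of a block of samples in dB (floored to avoid log of zero)."""
    power = sum(x * x for x in samples) / len(samples)
    return 10.0 * math.log10(max(power, 1e-12))


def level_based_double_talk(rout_block, sin_block, assumed_erl_db=6.0):
    """Flag double-talk when Sin is not at least assumed_erl_db below Rout."""
    return level_db(sin_block) > level_db(rout_block) - assumed_erl_db


rout = [1.0, -1.0] * 50
echo_only = [0.25 * x for x in rout]            # about 12 dB below Rout
with_near_end = [e + 1.0 for e in echo_only]    # echo plus a strong near-end signal
print(level_based_double_talk(rout, echo_only))      # False
print(level_based_double_talk(rout, with_near_end))  # True
```

The result depends directly on `assumed_erl_db`, which is the weakness of this approach when the true ERL is unknown.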

The problem with this approach is that the double-talk detector must have an accurate
estimate of the echo path ERL in order to determine if the level at Sin is too high.
However, precise knowledge of the ERL is generally not available. If the ERL estimate is
too high, the double-talk detector may trigger unnecessarily. Conversely, it may not
trigger at all if the ERL estimate is too low.

Another problem with this technique is that it will only reliably detect high-level
double-talk. If the double-talk signal is at a much lower level than the echo source signal,
low-level double-talk occurs. Under this condition, the increase in the level of Sin is usually
very small. The double-talk detector may fail to trigger, but noticeable divergence in the
LMS filter can still occur.

To detect low-level double-talk, the level of the residual echo signal (Sout) is often
monitored. If no double-talk or background noise is present, and the LMS filter is fully
converged, Sout can be as much as 40 dB lower than Rout. Assuming that the echo path
remains constant, any increase in Sout will likely be due to double-talk. Of course, if the
echo path does change, it will be mistaken for double-talk. So if this method is used, a
separate path change detection algorithm must be employed. A unified approach would be
simpler and preferred.

Correlation is a statistical function which is commonly used in signal processing. It can
provide a measure of the similarity between two signals (cross-correlation), or a single
signal and time-shifted versions of itself (autocorrelation). The use of correlation for
double-talk detection per se is known. Several patents exist for correlation-based
double-talk detection, including US5646990, US5526347 and US5193112. The correlation-based
approaches taken in prior-art methods generally involve the calculation of a single
cross-correlation coefficient, usually between a pair of the echo canceller's signals. The
problem with this technique is
that the degree of correlation can vary widely with different signals and echo paths. This
makes it very difficult to set thresholds on the correlation coefficient in order to determine
what state the echo canceller is in.

SUMMARY OF THE INVENTION 

A process has been developed which generates matrix coefficients using zero-lag auto- and
cross-correlations from signals commonly found in echo cancellers. Double-talk and path
changes are then detected using matrix operations such as determinants,
eigendecompositions, or singular value decompositions (SVDs).

The correlations between various signals in an echo canceller will change depending on
what state the echo canceller is in, i.e. if it is converged, unconverged, or in double-talk.
By arranging the various correlations in appropriate matrix form, key information about
the state of the echo canceller can be extracted by performing various matrix operations.
The preferred operation is to take the determinant, but eigendecompositions and singular
value decompositions (SVDs) can also be used. A novel aspect of the invention is the
formulation of a matrix using various correlation coefficients, and the subsequent analysis
of this matrix to determine the state of the echo canceller.

Accordingly the present invention provides a method of detecting double-talk and path
changes in an echo cancellation system, comprising generating a correlation-based matrix
of signals in said echo cancellation system; and analyzing said correlation-based matrix to
identify double-talk and path changes occurring in said system.




In the preferred embodiment, the correlation-based matrix is generated using the return
echo signal (Sin) and the output of an LMS adaptive filter.

The correlation-based matrix is generated using zero-lag auto- and
cross-correlations of signals commonly found in echo cancellers.

Double-talk and path changes are detected by analysis of the correlation-based matrix.
Possible analysis techniques include condition numbers, determinants,
eigendecompositions, and singular value decompositions.

In the preferred embodiment, determinants are used to detect double-talk and path
changes.

The invention can be implemented in either the time domain or the frequency domain in a
digital signal processor using conventional digital signal processing techniques.

BRIEF DESCRIPTION OF THE DRAWINGS 

The invention will now be described in more detail, by way of example only, with
reference to the accompanying drawings, in which:

Figure 1 is a schematic diagram of an echo canceller using LMS Adaptive Filtering;

Figure 2a is a plot showing the value of det [R] under normal convergence;

Figure 2b is a plot showing the value of det [R] with a path change; and

Figure 2c is a plot showing the value of det [R] with double-talk.

DETAILED DESCRIPTION OF THE INVENTION 

The layout of a typical LMS-based echo canceller is shown in Figure 1. It contains two
signals, travelling along a "send" path and a "receive" path. The echo source signal enters
the echo canceller as Rin and leaves as Rout. The send path input, Sin, consists of a
double-talk signal (if present) plus the echo source signal after it has travelled along the
echo path. By estimating the echo path, a synthetic echo signal can be generated to cancel
the echo in the send path. The echo cancelled signal leaves as Sout.

The LMS filter attempts to cancel the echo by adjusting itself to suppress the output signal
at Sout. Obviously, if Sin contains components other than echoed speech from the echo
source, the LMS filter will not converge to the correct solution; hence the need for
double-talk detection.

The preferred embodiment of the algorithm for this patent uses the Normalized-LMS
(N-LMS) algorithm. Mathematically, the adaptive filter tap-weight update procedure for the
N-LMS algorithm consists of the following three equations

$$\hat{d}[n] = \mathbf{w}^T[n]\,\mathbf{u}[n]$$

$$e[n] = d[n] - \hat{d}[n]$$

$$\mathbf{w}[n+1] = \mathbf{w}[n] + \frac{\mu}{a + \|\mathbf{u}[n]\|^2}\,\mathbf{u}[n]\,e[n]$$

where

$\mathbf{u}[n]$ = echo source signal
$\mathbf{w}[n]$ = LMS filter coefficients
$d[n]$ = Sin = desired LMS output (echo + double-talk)
$\hat{d}[n]$ = LMS output (estimated echo)
$e[n]$ = Sout = LMS error signal
$\mu$ = LMS step-size parameter
$a$ = a small constant (provides numerical stability).

The location of these signals in the echo canceller is also shown in Figure 1. The N-LMS
algorithm is well known to persons skilled in the art, and a more detailed treatment is readily
available in most adaptive filtering texts. See, for example, S. Haykin, Adaptive Filter
Theory, Prentice-Hall, Upper Saddle River, NJ (1996).
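The three N-LMS equations can be sketched in plain Python as follows. This is an illustrative sketch rather than the patent's implementation: the function names, the default step size of 0.5, the value of the constant `a`, and the pseudo-random test input are all assumptions chosen for the demonstration.

```python
import random


def nlms_step(w, u, d, mu=0.5, a=1e-6):
    """One N-LMS update.

    w: current tap weights; u: most recent input samples (newest first,
    same length as w); d: desired sample (Sin).
    Returns (new_w, e, d_hat).
    """
    d_hat = sum(wi * ui for wi, ui in zip(w, u))      # estimated echo
    e = d - d_hat                                     # error signal (Sout)
    norm = a + sum(ui * ui for ui in u)               # a + ||u[n]||^2
    new_w = [wi + (mu / norm) * ui * e for wi, ui in zip(w, u)]
    return new_w, e, d_hat


def identify_echo_path(echo_path, num_samples=2000, mu=0.5, seed=0):
    """Adapt to a known echo path with no double-talk and return the weights."""
    rng = random.Random(seed)
    w = [0.0] * len(echo_path)
    u = [0.0] * len(echo_path)
    for n in range(num_samples):
        u = [rng.uniform(-1.0, 1.0)] + u[:-1]         # shift in the newest sample
        d = sum(h * x for h, x in zip(echo_path, u))  # echo only
        w, e, d_hat = nlms_step(w, u, d, mu)
    return w


print(identify_echo_path([0.5, -0.3, 0.1]))
```

With an echo-only input the weights settle on the echo path; adding an uncorrelated near-end signal to `d` would disturb them, which is the divergence the detector is meant to prevent.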

One of the key parameters in the N-LMS algorithm is the LMS step-size parameter $\mu$.
This parameter controls both the speed and accuracy of convergence. The larger $\mu$ is, the
faster the algorithm will converge on the echo path, but the less accurate the steady-state
solution will be. To guarantee convergence of the N-LMS algorithm, $\mu$ must be less
than 2.

A common technique is to adjust the value of $\mu$ based on the state of the echo canceller. In
an unconverged state (such as at start-up, or following a path change), it is desirable to use
a large value for $\mu$ to permit rapid initial convergence. Once the LMS filter has achieved a
reasonable degree of convergence, $\mu$ can be reduced. This not only allows for a slightly
more accurate solution (and therefore more cancellation), but it will also slow potential
divergence from double-talk. To stop adaptation altogether, $\mu$ can simply be set to zero.
The double-talk and path change detectors can therefore control the operation of the LMS
filter by varying the value of $\mu$.
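The step-size policy described above can be sketched as a small lookup. The state labels and the numeric step sizes below are hypothetical; the text only states that the step size should be large when unconverged, smaller once converged, and zero to freeze adaptation.

```python
def step_size_for(state: str, mu_fast: float = 1.0, mu_slow: float = 0.25) -> float:
    """Pick an N-LMS step size from the detected echo canceller state.

    The state labels and default values are illustrative assumptions.
    """
    if state == "double-talk":
        return 0.0        # freeze the filter coefficients
    if state == "path-change":
        return mu_fast    # start-up or after a path change: converge quickly
    if state == "converged":
        return mu_slow    # more accurate solution, slower potential divergence
    raise ValueError(f"unknown state: {state!r}")


print(step_size_for("double-talk"))  # 0.0
```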

A double-talk detection algorithm in accordance with a preferred embodiment of the
invention, designed to work in conjunction with the echo canceller illustrated in
Figure 1, will now be described. It is implemented in a digital signal processor.

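Consider two signals, $X_0[n]$ and $X_1[n]$, generated by a linear combination of two
real-valued source signals, $S_0[n]$ and $S_1[n]$. Mathematically, this mixing process may be
described as

$$X_0[n] = H_{0,0}\,S_0[n] + H_{0,1}\,S_1[n]$$

$$X_1[n] = H_{1,0}\,S_0[n] + H_{1,1}\,S_1[n]$$

where $H_{i,j}$ are the mixing coefficients. In matrix form, this may be written as

$$\mathbf{X} = \mathbf{H}\,\mathbf{S}$$

where

$$\mathbf{X} = \begin{bmatrix} X_0[n] \\ X_1[n] \end{bmatrix}, \quad \mathbf{H} = \begin{bmatrix} H_{0,0} & H_{0,1} \\ H_{1,0} & H_{1,1} \end{bmatrix}, \quad \mathbf{S} = \begin{bmatrix} S_0[n] \\ S_1[n] \end{bmatrix}$$

A matrix R is defined as

$$\mathbf{R} = E[\mathbf{X}\mathbf{X}^T]$$

where E[...] is the statistical expectation operator. R may be expanded in two ways:

$$\mathbf{R} = E\begin{bmatrix} X_0^2 & X_0 X_1 \\ X_1 X_0 & X_1^2 \end{bmatrix}$$

$$\mathbf{R} = E[\mathbf{H}\mathbf{S}\mathbf{S}^T\mathbf{H}^T]$$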

From the first expansion, it is apparent that the diagonal terms in R are the zero-lag
autocorrelations of $X_0[n]$ and $X_1[n]$ and that both off-diagonal terms correspond to the
zero-lag cross-correlation between $X_0[n]$ and $X_1[n]$. Hence, R is a symmetric,
correlation-based matrix.

From the second expansion, we see that if H is full-rank, then R will also be full-rank if
$S_0[n]$ and $S_1[n]$ are both non-zero and uncorrelated. In most cases, a sufficient condition
for this is that $S_0[n]$ and $S_1[n]$ are different signals from different sources.
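A minimal sketch of how the correlation-based matrix might be estimated in practice is given here. The text defines R through the expectation operator but does not say how the expectation is approximated; replacing it with a sample average over a block is an assumption, as are the function names.

```python
def correlation_matrix(x0, x1):
    """Estimate the 2x2 zero-lag correlation matrix from two equal-length blocks."""
    n = len(x0)
    r00 = sum(a * a for a in x0) / n               # zero-lag autocorrelation of X0
    r11 = sum(b * b for b in x1) / n               # zero-lag autocorrelation of X1
    r01 = sum(a * b for a, b in zip(x0, x1)) / n   # zero-lag cross-correlation
    return [[r00, r01], [r01, r11]]


def det2(r):
    """Determinant of a 2x2 matrix."""
    return r[0][0] * r[1][1] - r[0][1] * r[1][0]


x0 = [1.0, 2.0, 3.0, 4.0]
print(det2(correlation_matrix(x0, x0)))                     # 0.0 (identical signals)
print(det2(correlation_matrix(x0, [1.0, -1.0, 1.0, -1.0]))) # 7.25
```

Identical signals give a singular matrix, while dissimilar signals give a clearly non-zero determinant.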

The way in which the matrix can be used to perform double-talk and path change
detection will now be explained. First, suppose we generate the signal mixtures using
convolutions:

$$\mathbf{X} = \mathbf{H} \otimes \mathbf{S}$$

Now the terms in the mixing matrix can be vectors. We further impose the condition that
H have the following form:

$$\mathbf{H} = \begin{bmatrix} H_{0,0} & 1 \\ H_{1,0} & 0 \end{bmatrix}$$

With H defined in this way, it is now possible to connect the terms in the preceding
equations with the parameters available in the echo canceller layout shown in Figure 1.
Let

$S_0$ = echo source signal (Rin, Rout) = $\mathbf{u}[n]$
$S_1$ = double-talk signal
$H_{0,0}$ = echo path
$H_{1,0}$ = LMS filter coefficients = $\mathbf{w}[n]$

With these definitions, it is apparent that

$$X_0 = H_{0,0} \otimes S_0 + S_1 = d[n] \quad \text{(Sin)}$$

$$X_1 = H_{1,0} \otimes S_0 = \hat{d}[n] \quad \text{(LMS output)}$$

The question of what happens to R under the various states of echo canceller operation
will now be examined.

Case 1: Unconverged, no double-talk

If the LMS filter is in an unconverged state, $H_{0,0} \ne H_{1,0}$. This situation occurs when the
echo canceller is first started, or following a major echo path change. Since the LMS filter
does not contain an accurate echo path estimate, $X_0 \ne X_1$, and R will be full rank (unless
$H_{1,0} = 0$, but this condition is usually temporary) with a very low condition number
($\kappa \sim 10^1$). See, for example, G. H. Golub and C. F. Van Loan, Matrix Computations,
3rd ed., Johns Hopkins University Press, Baltimore, MD (1996). As convergence proceeds, the
degree of correlation between $X_0$ and $X_1$ increases. This has the effect of rapidly
raising the condition number of R. As a result, the determinant of R will fall, and its
eigenvalues and singular values will become increasingly disparate.

Case 2: Converged, no double-talk

In this state $H_{0,0} \approx H_{1,0}$, so $X_0 \approx X_1$. This will make R very nearly rank deficient, and its
condition number very large ($\kappa \sim 10^6$). Since R is close to being singular, its determinant
will become very small. Similarly, we would expect to find only one significant
eigenvalue or singular value.

Case 3: Double-talk

When double-talk is occurring, $X_0$ contains components from both $S_0$ and $S_1$, while $X_1$ is
derived solely from $S_0$. In this case, $X_0$ and $X_1$ are highly uncorrelated. R will have a
low condition number, and this will be sustained for the duration of the double-talk. The
higher the double-talk level, the lower the condition number becomes. This will raise the
determinant of R, and we will find two significant eigenvalues and singular values.
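The behaviour of the determinant in the three cases can be made explicit for the 2-by-2 matrix. Writing $r_{ij} = E[X_i X_j]$ and introducing the normalized zero-lag correlation coefficient $\rho$ (notation added here for illustration; it does not appear in the original text), and assuming both autocorrelations are non-zero, the determinant factors as:

```latex
\det \mathbf{R} = r_{00}\,r_{11} - r_{01}^2
               = r_{00}\,r_{11}\,\bigl(1 - \rho^2\bigr),
\qquad
\rho = \frac{r_{01}}{\sqrt{r_{00}\,r_{11}}}
```

By the Cauchy-Schwarz inequality $|\rho| \le 1$, so $\det \mathbf{R} \ge 0$. As the filter converges, $X_1$ approaches $X_0$, $\rho$ approaches 1, and the determinant collapses toward zero. Double-talk adds a component to $X_0$ that is absent from $X_1$, which lowers $|\rho|$ and raises the determinant for as long as the double-talk lasts.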

Once the matrix R is generated, a variety of operations are available to determine what
state the echo canceller is in. The condition number, determinant, eigenvalues and
singular values of R can all be used to test for double-talk or path changes. The determinant
is used in the preferred embodiment because it is the simplest matrix operation to
perform.
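For comparison with the determinant, the eigenvalues and condition number of a symmetric 2-by-2 matrix also have closed forms. The sketch below is illustrative only: the function names and the example matrices are assumptions, and the condition number is taken as the ratio of the largest to the smallest eigenvalue, which holds for a symmetric positive semi-definite matrix such as R.

```python
import math


def sym2_eigenvalues(r):
    """Eigenvalues (largest first) of a symmetric 2x2 matrix [[a, b], [b, c]]."""
    a, b, c = r[0][0], r[0][1], r[1][1]
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean + radius, mean - radius


def condition_number(r):
    """Ratio of largest to smallest eigenvalue; infinite if R is singular."""
    lam_max, lam_min = sym2_eigenvalues(r)
    return math.inf if lam_min <= 0.0 else lam_max / lam_min


nearly_converged = [[1.0, 0.999], [0.999, 1.0]]   # X0 and X1 almost identical
double_talk_like = [[2.0, 0.5], [0.5, 1.0]]       # X0 carries extra energy
print(condition_number(nearly_converged))   # large (about 2000)
print(condition_number(double_talk_like))   # small (under 3)
```

The square root in the eigenvalue formula is the extra cost relative to the determinant, which needs only two multiplications and a subtraction.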

To illustrate the effectiveness of this algorithm at detecting double-talk and path changes,
simulations were carried out and the results are shown in Figure 2. The plots indicate the
value of det [R] under normal convergence, a path change, and double-talk. The scaling
of the y-axis on the plots clearly demonstrates the variations observed in det [R] under the
three different states. The simulations were carried out using ITU CSS synthetic speech
signals from the G.168 Digital Echo Canceller standard (ITU-T Recommendation G.168,
Digital Echo Cancellers). The signals were 48000 samples long, and a 60 ms echo path
was used (which was changed to 15 ms during the path change simulation).
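Tracking det [R] over time, as in the simulations, requires some running estimate of the correlations. The text does not say which estimator was used; the sketch below assumes exponentially weighted averages with a hypothetical forgetting factor, and the class name is illustrative.

```python
class RunningCorrelationDet:
    """Track det[R] for X0 = Sin (d[n]) and X1 = LMS output (d_hat[n])."""

    def __init__(self, forgetting=0.99):
        self.forgetting = forgetting          # assumed smoothing constant
        self.r00 = self.r01 = self.r11 = 0.0

    def update(self, d, d_hat):
        """Fold in one sample pair and return the current determinant."""
        lam = self.forgetting
        self.r00 = lam * self.r00 + (1.0 - lam) * d * d
        self.r01 = lam * self.r01 + (1.0 - lam) * d * d_hat
        self.r11 = lam * self.r11 + (1.0 - lam) * d_hat * d_hat
        return self.r00 * self.r11 - self.r01 * self.r01


tracker = RunningCorrelationDet()
det = 0.0
for n in range(1000):
    d = [1.0, 1.0, -1.0, -1.0][n % 4]
    d_hat = [1.0, -1.0, 1.0, -1.0][n % 4]     # unrelated to d
    det = tracker.update(d, d_hat)
print(det > 0.9)  # True: dissimilar signals keep the determinant high
```

A forgetting factor closer to 1 gives a smoother determinant but a slower response to a change of state.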

Under normal convergence (Fig. 2a), det [R] rapidly decays to near-zero values. When a
path change occurs (Fig. 2b), det [R] spikes to a large value and then decays (to
emphasize this trend, convergence was slowed by a factor of 10 following the path
change). With double-talk (Fig. 2c), even larger, but sustained, spikes are present in
det [R]. The differences in these three plots make it very easy to tell what state the echo
canceller is in simply by checking the level of det [R]. The highest levels indicate
double-talk, medium levels (along with decay) occur with path changes, and very low levels are
characteristic of full convergence. Based on these results, thresholds on det [R] can be set
to separate three states, in order of increasing det [R]:

- Normal (converged) operation.
- Path change detected.
- Double-talk detected.

Once the state of the echo canceller is determined, the LMS filter operation can be
adjusted accordingly.
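The three-way decision can be sketched as a pair of threshold comparisons. The threshold values are not given in the text, so they are left as parameters here; the function name and state labels are hypothetical, and the sketch ignores the decay behaviour that also distinguishes a path change.

```python
def classify_state(det_r, path_change_threshold, double_talk_threshold):
    """Map det[R] to a state using two thresholds (assumed to be supplied by the caller)."""
    if path_change_threshold >= double_talk_threshold:
        raise ValueError("path_change_threshold must be below double_talk_threshold")
    if det_r >= double_talk_threshold:
        return "double-talk"     # highest levels of det[R]
    if det_r >= path_change_threshold:
        return "path-change"     # medium levels of det[R]
    return "converged"           # very low levels of det[R]


# Illustrative thresholds only; real values depend on signal scaling.
print(classify_state(1e-6, path_change_threshold=1e-3, double_talk_threshold=1e-1))  # converged
print(classify_state(5e-2, path_change_threshold=1e-3, double_talk_threshold=1e-1))  # path-change
print(classify_state(0.8, path_change_threshold=1e-3, double_talk_threshold=1e-1))   # double-talk
```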

A well-known relation in signal processing is that the convolution of two signals in time
is equivalent to the multiplication of their frequency spectra. This property makes it
possible to propose a variation on the preceding algorithm in which frequency-domain
versions of the signals are used. X has been defined in the time-domain using
convolutions:

$$\mathbf{X}[n] = \mathbf{H}[n] \otimes \mathbf{S}[n]$$

By taking the Fourier Transform of all terms involved, it is possible to rewrite the above
equation in the frequency-domain as

$$\mathbf{X}(f_k) = \mathbf{H}(f_k) \cdot \mathbf{S}(f_k)$$

for all frequencies in the range $0 < f_k < F_s/2$, where $F_s$ is the sampling frequency of the
signals. The generation and analysis of the correlation-based matrix R is carried out as
before, only now R is created using the frequency-domain version of X.

The advantage to this approach is that the algorithm no longer needs to have an accurate
echo path estimate for R to have a high condition number during non-double-talk states.
The double-talk detector becomes completely insensitive to path changes. Depending on
the application, this may or may not be a desirable property. Low-level double-talk
detection abilities improve, but a separate path change detection scheme must now be
used.

Implementation of a frequency-domain version of this process can be accomplished
through the use of Fast Fourier Transforms (FFTs) or subbanding techniques.

As will be understood by persons skilled in the art, the inventive process can be
implemented in a digital signal processor or other suitable digital signal processing
device.



Glossary 

Adaptive Filter: A filter whose coefficients can be adjusted during operation. Adaptive
filters are used to estimate unknown parameters, for example an unknown echo path.

Autocorrelation: A statistical quantity which roughly measures the similarity of a signal
to time-shifted versions of itself.

Condition Number: A measure of how close a matrix is to being singular. The condition
number for an arbitrary matrix A is given by $\kappa(\mathbf{A}) = \|\mathbf{A}\|\,\|\mathbf{A}^{-1}\|$.

Convergence: The condition achieved when the LMS filter has accurately modelled the
echo path and is no longer undergoing significant changes. At convergence, the LMS
filter is cancelling the maximum amount of echo.

Cross-Correlation: A statistical quantity which roughly measures the similarity of two
separate signals.

Divergence: The process by which the LMS filter coefficients move away from the actual
echo path to erroneous and unpredictable solutions. During divergence, the amount of
echo being cancelled becomes less and less.

Double-Talk: The condition which occurs during simultaneous transmission of signals
from both sides of the echo canceller.

Echo Path: A mathematical description of the process which imparts an echo to a signal.

ERL: Echo Return Loss. The loss a signal experiences as it travels along the echo path
from Rout to Sin.

ERLE: Echo Return Loss Enhancement. A common method of measuring the
performance of an echo canceller. This measurement represents the amount that an echo
signal has been reduced from Sin to Sout.

LMS Algorithm: Least Mean Squares algorithm. Common adaptive filtering technique.

N-LMS Algorithm: Normalized Least Mean Squares algorithm. A variation on standard
LMS in which the tap-weight update term is scaled by the inverse of the input signal
power.

Rank: The number of non-zero eigenvalues or singular values a matrix has. Full-rank
matrices have a non-zero determinant, and are thus non-singular and invertible.

RLS Algorithm: Recursive Least Squares algorithm. Common adaptive filtering
technique.
Claims:

1. A method of detecting double-talk and path changes in an echo cancellation
system, comprising:

generating a correlation-based matrix of signals in said echo cancellation system;
and

analyzing said correlation-based matrix to identify double-talk and path changes
occurring in said system.

2. A method as claimed in claim 1, wherein said correlation-based matrix is 
generated using zero-lag auto and cross-correlations of said signals. 

3. A method as claimed in claim 2, wherein a determinant of said matrix is used to
detect said double-talk and path changes.

4. A method as claimed in claim 3, wherein said double-talk and path changes are
inferred when the value of said determinant passes predetermined threshold values.

5. A method as claimed in claim 2, wherein eigendecompositions of said matrix are
used to detect said double-talk and path changes.

6. A method as claimed in claim 2, wherein singular value decompositions of said
matrix are used to detect said double-talk and path changes.

7. A method as claimed in claim 2, wherein condition numbers of said matrix are 
used to detect said double-talk and path changes. 

8. A method as claimed in any one of claims 1 to 7, wherein said echo cancellation
system includes an adaptive filter, and said signals comprise an echo signal and an output
of said adaptive filter.

9. A method as claimed in claim 8, wherein said filter is an LMS filter.

10. A method as claimed in claim 9, wherein said LMS filter implements a
normalized-LMS algorithm.

11. A method as claimed in any one of claims 1 to 10, wherein the elements of said
correlation-based matrix are generated in the time domain.

12. A method as claimed in any one of claims 1 to 10, wherein the elements of said
correlation-based matrix are generated in the frequency domain.

13. A method as claimed in claim 3, wherein said determinant R is of the form

$$\mathbf{R} = E[\mathbf{X}\mathbf{X}^T], \quad \mathbf{X} = \begin{bmatrix} X_0[n] \\ X_1[n] \end{bmatrix}$$

wherein $X_0[n]$ and $X_1[n]$ are generated by a linear combination of two real-valued source
signals, $S_0[n]$ and $S_1[n]$.

14. A method as claimed in claim 1, wherein $S_0[n]$ comprises an echo signal and $S_1[n]$
comprises a cancellation signal.
ABSTRACT OF THE DISCLOSURE

A process is described which generates matrix coefficients using zero-lag auto and cross- 
correlations from signals commonly found in echo cancellers. Double-talk and path 
changes are then detected using matrix operations such as determinants, 
eigendecompositions, or singular value decompositions (SVDs). In a preferred 
embodiment, the determinant of the correlation-based matrix is compared against 
predetermined threshold values. 



[Figure 1: schematic diagram of an echo canceller using LMS adaptive filtering. Recoverable labels: Double-Talk, Send-Path.]




